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ABSTRACT 


A  variable  span  scatterplot  smoother  based  on  local  linear  fits  is 
described.  Loral  cross-validation  is  used  to  estimate  the  optimal  span 
as  a  function  o?  abscissa  value.  A  rejection  rule  is  suggested  to  make 
the  smoother  resistant  against  outliers.  Computationally  efficient 
algorithms  making  us.j  of  updating  formulas  and  corresponding  FORTRAN 
subroutines  are  presented. 
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1 •  Introduction 

A  smoother  is  a  procedure  that  operates  on  bivariate  data 
Cxi  ,yO  . . .  (xn,yn)  and  produces  a  decomposition 

y;  =  s(xi)  +  ri,  i=l ...n.  (1) 

Here  s  is  a  smooth  function,  often  simply  called  t*-t  vtvroth,  and  the  ri 
are  residuals.  It  is  possible  to  formally  define  . <a t  constitutes  a 
smooth  function,  and  to  define  measures  of  smoot  ess,  but  for  our 
purposes  an  intuitive  notion  will  be  sufficient.  soothers  are  used  to 
summarize  the  association  between  the  predict'*  variable  X  and  the 
response  Y.  It  was  pointed  out  by  Cleveland  (*'.->  and  is  a  commonly 
held  belief,  that  when  looking  at  a  scatterplot  the  eye  is  distracted  by 
the  extreme  points  in  the  point  cloud,  i.e.,  *,  e  fuzzy  background,  and 

tends  to  miss  structure  in  the  bulk  of  the  Augmentation  of  the 

plot  by  a  smooth  is  a  possible  remedy.  More  formally,  smoothers  can  be 
regarded  as  curve  estimators;  one  assumes  chat  the  response  was  generated 
by  adding  random  noise  to  a  smooth  function: 

y ;  =  f (x i )  +  e  i  (2) 

and  considers  the  smooth  s  as  an  estimate  for  f. 

Recently  scatterplot  smoothers  have  found  a  new  use  in  multiple 
nonparametric  regression  (Friedman  and  Stuetzle,  1981).  Let 
(xi ,yi ) . . > (xn,vn;  denote  the  observations;  x\  here  i3  a  vector  in  RP,  not 
just  a  single  number,-  Assume  as  above  that  yi  ~  f  ( x  i )  +  tj,  i  =  !...n. 
Projection  pursuit  regression  constructs  an  estimate  m  for  f  of  the  form 

M 

m(x)  =  £  Sj (a, . x) , 

i  =  1 


T 
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where  the  <m  and  suitably  chosen  unit  vectors  in  RP.  For  given  ai,  s* 
(essentially)  is  found  by  smoothing  the  scatterplet  of  the  residuals 
i  - 1 

r-ji-f  -  yj-J  sk(«lc.Xj)  versus  a,.Xj.  The  smoother  described  in  this 
k  =  1 

report  is,  up  to  minor  modifications,  the  one  used  in  the  current 
projection  pursuit  regression  procedure., 

2 .  Basic  Concepts 

According  to  the  definition  above,  any  procedure  that  parses  a  smooth 
curve  thrjugh  a  scatterplot,  for  example  a  procedure  that  fits  a  least 
squares  straigh1  line,  would  be  called  a  smoother.  This  is  not  quite 
what  we  have  in  mind.  Assume  the  data  are  generated  according  to  (2). 
Ue  are  interested  in  procedures.  That  can  approximate  f  arbitrarily 
closely,  given  a  dense  enough  sample,  without  any  conditions  on  f  apart 
from  f  being  smooth.  Such  procedures  can  be  based  on  local  averaging. 
Take  s(xj)  to  be  the  average  of  the  responses  for  those  observations  with 
predictor  values  in  a  neighborhood  N  of  x  : 

s(xi>  -  ave(yj|xj€  I).  (3) 

Here  "aveK  can  stand  for  the  arithmetic  mean,  the  median,  or  more 
complicated  ways  o*  averaging  to  be  discussed  below.  A  critical 
parameter  to  be  chosen  is  the  SPAN,  the  size  of  the  neighborhood  over 
which  averaging  takes  place..  It  controls  the  smoothness  of  s.;  The 
bigger  the  span,  the  smother  s  will  be..  To  obtain  consistency,  i,e.,  to 
make  sure  that  s  gets  arbitrarily  close  to  f  as  the  sampling  rate 
increase*-.-  one  must  shrink  the  diameter  of  the  neighborhood  in  such  a  way 
that  the  number  of  observations  in  the  neighborhood  still  grows  to 
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infinity.  Shrinking  the  neighborhood  makes  the  systematic  or  bias 
component  in  the  estimation  error  diminish,  while  increasing  the 
neighborhood  sample  size  guarantees  that  the  variance  component  of  the 
error  goes  to  zero  as  well. 

An  alternative  method  for  nonparametr ic  curve  estimation  is  based  on 
series  expansions:  make  an  ansatz  for  s  of  the  form 

M 

s(xx)  =  l  a i g  i  ( x k > 
i  =  1 

where  the  g i  ( x )  can,  for  example,  be  polynomials  or  trigonometric 
functions.  The  constants  aj  are  then  determined  by  fitting  the  series  to 
the  data,  most  commonly  by  least  squares.  The  role  of  the  span  is  played 
here  by  M,  the  number  of  terms  included  in  the  model.;  Trigonometric 
functions  have  been  used  with  success  in  cases  where  the  signal  is 
naturally  periodic.  If  the  abscissas  Xj  are  equi-spaced,  the  fit  is 
particularly  inexpensive  to  compute  using  thu  Fast  Fourier  Transform. 
Both  conditions  are  usually  not  fulfilled  in  the  case  of  scatterplot 
smoothing,  making  the  method  less  attractive.-  The  use  of  polynomials  has 
the  drawback  that  they  are  not  well  suited  to  represent  a  wide  variety  of 
commonly  encountered  functions,  for  example,  functions  with  asymptotes. 

There  are,  of  course,  connections  between  smoothing  by  senes 
expansion  and  smoothing  by  local  averaging.  If  the  series  is  fitted  by 
least  squares,  the  fitted  values  s(xj)  are  weighted  averages  of  the 
responses  yj.  Depending  on  the  abscissas  and  the  functions  g;(x),  the 
weights  determining  s(x,)  might  or  might  not  be  concentrated  on  responses 
with  corresponding  predictor  values  close  to  x,.  If  they  are,  the  series 
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expansion  method  behaves  like  a  local  averaging  method..  An  example  of 
this  is  least  squares  fit  of  cubic  splines  which  will  be  further 
discussed  in  Section  9. 

3.,  A  Simple  Nonres i stant  Smoother 

The  simplest  example  for  a  local  averaging  type  smoother  is  the  moving 
average,  uhere  "ave”  in  equation  (3)  denotes  the  arithmetic  mean.  The 
size  of  the  neighborhood  is  usually  specified  by  the  span,  the  number  k 
of  observations  to  be  included  in  the  averaging.  We  Mill  assume  k  to  be 
odd  and  the  abscissas  x;  to  be  in  increasing  order.  The  neighborhood  can 
be  chosen  either  symmetrically,  containing  k/2  observations  to  the  left 
of  x,-  and  the  same  number  to  the  right,  or  it  can  be  chosen  to  contain 
the  k  nearest  neighbors  of  Xj,  including  x*.  (We  assume  that  k/2  is 
computed  by  integer  division.)  There  are  no  general  results  on  which  of 
thes>  two  possibilities  is  better  statistically.-  The  nearest  neighbors 
approach  generalizes  to  higher  dimensions,  but  the  choice  of  a  symmetric 
neighborhood  is  computationally  simpler  in  that  exactly  one  point  enters 
and  one  point  leaves  the  neighborhood  as  one  moves  from  observation  i  to 
observation  i+i,.  We  will,  in  the  following,  use  symmetric  neighborhoods.- 
The  boundaries,  where  it  is  not  possible  to  keep  N  symmetric,  nave  to  be 
treated  specially;  a  commonly  used  adjustment  is  to  shrink  the 
neighborhood  so  that  for  i=1  and  i=n,  one  averages  only  over  k/2+1 
observations.  With  these  conventions,  the  moving  average  smoother  is 
defined  by 

s(x;>  ;  mean ( y 3 | max ( i - k/2 , i )  .<  j  {  mint i+k/2,n> . 

Obv'ously,  the  mean  does  not  have  to  be  recomputed  every  time.-  It  can  be 
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updated,  reducing  the  computation  from  nk  to  n.  Such  updating  can  be 
done  for  all  the  smoothers  we  will  consider,  and  is  highly  desirable 
because  in  typical  apolications  k  is  5‘/.  10  30M  of  n,  and  thus  the  savings 
are  substantial.  The  simple  moving  average  smoother  has  some  serious 
shortcomings.;  One  disturbing  property  is  that  it  does  not  reproduce 
straight  1  v<js  if  the  abscissa  values  ore  not  equi-spaced.:  Another 
disturbing  feature  is  the  bad  behaviour  at  the  boundaries..  If,  for 
example,  the  slope  of  the  underlying  function  f  is  positive  at  the  right 
boundary.,  the  estimate  for  observations  close  to  the  boundary  will  be 
biased  downwards;'  if  the  slope  is  negative,  the  estimate  is  biased 
upwards..  Both  problems  can  be  alleviated  by  fitting  a  least  squares 
straight  line  L  to  the  observations  in  the  neighborhood  instead  of 
fitting  a  constant  and  taking  the  value  of  the  line  at  x,  as  the  smoothed 
value..  This  obviously  solves  bo^h  problems  mentioned  above.  For  the 
computation,  again  updating  formulas  can  be  used.  The  slope  B  and 
intercept  a  of  the  least  squares  straight  line  through  a  set  of  points 
( x i , y i ) . . . ( x n , y m )  are  given  by 


a  =  Vm  - 


with 


C  m 

a  =  — 

v™ 

Xm  -  I  Xi'/m, 

Pit  =  Z  V  i/m. 

Cm  ~  1  (x ,-x,Hy  i-y„)  • 
V*  -  }'  (xt-x„)'.. 


When  ue  want  to  add  an  observation  (x.»i>y.*i)>  ue  can  make  use  of  the 
following  easily  derived  formulas: 

x.+  i  =  (mx.  +  x**i)  /  (ra+1), 

V«*i  =  (my.  +  y.ti)  /  (m+1), 
m+1 

C«*1  =  c.  +  -  (x.»i  -  x„*i)(y«*i  - 

m 

m+ 1 

V„+i  =  V at  +  (x.+  i  ~  x .4 i ) 2 . 

m 

Analogous  formulas  can  obviously  be  used  for  removal  of  an  observation 
from  the  set. 

4.  Choice  of  Span 

The  most  important  choice  in  the  use  of  a  local  averaging  smoother  is 
the  choice  of  the  span  value.  If  the  smoother  is  regarded  as  a  curve 
estimator,  then  the  span  controls  the  trade  off  between  bias  and  variance 
of  the  estimate.  Ue  illustrate  this  for  the  case  of  a  simple  moving 
average  smoother,.;  In  this  case,  the  smoothed  value  at  point  Xj  is  given 
by 

1  i+k/2 

six,)  =  -  £  y  3 . 

k  l  -  k/2 

If  we  assume  that  the  errors  «;  are  l.i.d..  with  experted  value  zero  and 
variance  o2,  then  the  expected  squared  error  at  point  x,  is 

1  i+k/2  1 

ESC  ( x  i  |  k )  =  (fix;)  -  -  l  f  I  x  j )  )2  +  -  o2.;  (4) 

k  i-k/2  k 

Increasing  the  span  will  (if  d2f/,x2*0)  increase  the  f.rst  term,  the  bias 


component  of  the  estimation  error  and  decrease  the  second  term,  the 
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variance  component;  decreasing  the  span  will  have  the  opposite  effect. 
The  span  should  be  chosen  such  that  both  compnents  of  the  error  are 
reasonably  balanced..  Stated  more  geometrically,  a  larger  span  makes  the 
smooth  appear  less  Wiggly  by  more  strongly  damping  high  frequency 
components  of  the  senes  (x,.yj). 

We  have,  so  far,  said  nothing  useful  on  how  to  choose  the  span  in 
practice.  The  advice  given  above  on  balancing  bias  and  variance  is  not 
very  helpful  because  both  f  and  the  variance  of  the  random  error  are 
unknown 


One  can  estimate  the  opcmal  span  value  in  a  particular  situation  as 
that  value  that  minimizes  an  estimate  for  the  integrated  squared  error 


1 2  ( k  J 


r 

!  ESE ( x | k )  dF ( x  ) . 


Using  the  average  squared  residual  of  the  data  from  the  smooth 
1  n 

i 2 ( k )  =  -  £  [y,-s(x,|k)]2 

n  i  =  1 

for  this  purpose  is  not  appropriate  since  this  is  always  minimized  by  the 
span  value  k=l.  A  better  estimate  is  provided  by  a  method  referred  to  as 
"cross-val ida!  on"  01,  Stone,  1974)  or  "predictive  sample  reuse" 
(Geisser,  1975)  Each  observation  is  in  turn  deleted  and  the  value  of 
the  smooth  s(  ,,(x,|k)  at  x,  is  calculated  from  the  other  n- 1 
observations.  The  cross-val ida ted  estimate  of  the  integrated  square 
error;  is 

1  n 

I  2  c  v  f  k  ;  -  -  }  [  y  ,  -  s ,  ,  ,  (  x  ,  j  k  )  ]  2  .  1  5 ) 

n  1-1 
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Clearly,  Etl2CvJ  equals  the  expected  squared  error  obtained  by  applying 
the  procedure  to  a  sample  of  n-1  observations  from  the  same  dist>  ibution. 
The  cross-vali dated  estimate  for  the  optimal  span  value  is  taken  to  be 
the  value  kCv  that  minimizes  (5), 

kcv  =  min' 1  i2Cv  (k) . 

0<k  SN 

Model  selection  through  cr oss-val i dat .on  has  been  remarkably  successful 
in  a  wide  variety  of  situations  (see  M.  Stone,  1974,  Geisser,  1975, 
Craven  and  Wahba,  1957,  C.  Stone,  1981). 

For  the  moving  average  smoothers  discussed  in  the  previous  section, 
the  deleted  smooth  estimates  s  <  i  1 1  x  1 1 k )  are  especially  easy  to  compute; 
each  observation  is  simply  deleted  from  the  neighborhood  used  to  compute 
its  local  straight  line  fit..  Again,  the  use  of  updating  formulas  makes 
this  computation  very  rapid.  As  one  moves  from  observation  i  to  1+1, 
exactly  two  observations  enter  the  neighborhood  (i  and  i+k/2+1)  and 
exactly  two  leave  it  Ci+1  and  i-k/2).  The  (deleted)  residual  squared 

rz( i)  =  Cy i  -  ?<  i)  (xi|k)]2  (6) 

is  computed  for  each  observation  and  then  averaged  over  all  observations, 
1  n 

I2cv  (k)  =  -  l  r*<  i(.  (7) 

n  i  =  1 

For  small  to  moderate  changes  in  k,  l2cv(k)  changes  very  little  so  that 
it  is  adequate  to  evaluate  it  for  several  (4  to  7)  discrete  values  of  k 
in  the  range  [0  <  k/n  <  1J.  The  value  of  k  corresponding  to  the  smallest 
of  these  values  is  then  used.  This  can  be  accomplished  by  maintaining 
several  running  average  smoothers  -  one  for  each  span  value  -  in  the  pass 
over  the  data,  thus  keeping  the  compute* ionai  cost  linear  in  n.. 
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So  far,  ue  have  been  assuming  that  the  span  is  constant  over  the  whole 
range  of  predictor  values.  This  is  not  optimal  if  either  the  variance  of 
the  random  component  or  the  second  derivative  of  the  underlying  function 
f  change  over  the  range  of  predictor  values..  A  local  increase  in  error 
variance  would  call  for  an  increase  in  span,  whereas  an  increase  in 
second  derivative  of  f  would  require  a  decrease.;  It  is,  therefore, 
desirable  to  allow  the  span  value  to  adapt  to  these  changing  conditions. 
This  requires  that  the  optimal  span  value  be  choosen  locally  rather  than 
choosing  a  single  global  value,.  Again,  the  form  of  moving  average 
smoothers  make  this  especially  easy;  the  (deleted)  residual  squared  (6) 
-for  each  of  the  several  k  values-  is  averaged  loca'ly  in  a  neighborhood 
of  each  observation 

1  i+L/2 

iZcv  ( k ;  x  i )  =  -  £  r 2  (  a  ,  (  xjj  I  k )  (8) 

L  £=i-L/2 

rather  than  globally  over  all  observations  (7).  Note  that  (8)  also  has 
the  form  of  a  simple  moving  average  smoother  and  can  therefore  bo 
computed  rapidly  through  the  use  of  updating  formulas.  The  value  that 
minimizes  (8) 

k c v ( *  i  1  =  m n 1  I2cv(k;x,)  (9) 

0<k<N 

is  the  span  value  used  tor  the  l  th  observation.; 

Most  often  the  shape  of  ’2CV  (k;x;)  near  •  ts  minimum  value  is  shallow 
and  asymmetric,  increasing  more  slowly  in  the  direction  of  smaller  k 
values.;  Variability  in  the  estimate  i2Cv<  therefore,  causes  kcv  to  be 
highly  variable  and  biased  toward  smaller  values.  Although  this  has 
little  effect  on  the  quality  of  the  resulting  smooth  in  terms  of  expected 
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squared  error  (ESE),  it  does  effect  its  aesthetic  quality  since,  fo*' 
comparable  ESE,  the  less  smooth  solution  tends  to  be  selected.  This  can 
be  remedied  by  forcing  the  procedure  to  take  the  smoothest  solution  in 
these  circumstances..  Specifically,  the  largest  span  value  k*cv  for  which 

i2cv(k*cv:xi)  <  (1+a)  min  1 2  cv(k ; x  i  ^  (10) 

0<k<N 

is  used  for  the  ith  observation.-  Here  o.  loosely  controls  an  upper  limit 
on  the  fraction  of  ESE  that  is  to  be  sacrificed  for  the  goal  of 
smoothness.  Values  in  the  range  0.05  <  a  i  0.2  are  reasonable. 

Since  the  optimal  span  value  is  estimated  separately  for  each 
observation,  its  size  can  vary  substantially  over  the  range  of  predictor 
values.  However,  since  for  close  abscissa  values  the  neighborhoods 
overlap  considerably,  this  variation  is  constrained  to  be  smooth.  The 
degree  of  smoothness  is  controlled  by  the  parameter  L  (8)  which  can  be 
regarded  as  a  span  for  smoothing  the  (deleted)  residuals  squared  from  the 
original  smooths.;  As  uith  the  original  smoother,  its  optimal  value  can 
be  estimated  by  cross-validation.:  To  the  extent  that  the  variation  of 
the  se--  nd  der-vative  of  f  or  the  variation  in  the  random  component  is 
comparable  to  the  variation  of  f  itself,  this  second  level  of  cross- 
validation  may  be  beneficial.  Again,  updating  formulas  make  this 
relatively  inexpensive..  However,  in  most  circumstances  chu  -sing  a 
nominal  value  for  l  (0  2n  to  0.3n)  is  adequate. 

It  is  important  to  note  that  using  cross-validated  residuals  as  a 
basis  for  choosing  span  value  is  highly  sen,itive  to  lack  of  independence 
among  the  «,  (2)  as  ordered  on  x.  If  there  is  a  large  positive 


(nega  1 1 ve ) 


correlation  among  obset v a t i ons 


with  similar  x  values, 
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substantial  under  (over)  estimates  will  result..  In  situations  where  a 
high  degree  of  auto-correl ation  is  suspected,  those  span  selection 
procedures  should  be  used  with  caution. 

Figure  1  illustrates  the  application  of  this  smoothing  algorithm  to  an 
artificial  data  set.  (A  FORTRAN  subroutine  implementing  vhis  algorithm 
is  listed  in  Appendix  1,)  The  data  for  this  examr/le  consists  of  M=200 
pairs  (xi,yi>  with  the  Xj  drawn  randomly  (lid)  from  a  uniform 
distribution  in  the  interval  [0,1].  The  y;  are  obtained  from 
y  i  =  sin[2n(  1-x  f ) 2]  +  x  i e  i 

with  the  fj  iid  standard  normal.  The  parameter  ALPHA  [a  in  (10)]  was  set 
to  0.1  and  RESPAN  [L/n  in  (8)]  was  set  to  0.25  (see  Appendix  1).  This 
example  simulates  a  situation  in  which  both  the  curvature  of  f  decreases 
and  the  variance  of  the  random  component  increases  with  increasing  x. 
Figure  la  is  a  scatterplot  of  the  simulated  data.  Figure  1b  also  shows 
this  scatterplot,  but  with  the  resulting  smooth  super  in. posed .  The  height 
of  the  curve  near  the  bottom  indicates  the  span  value  chosen  at  each  x.; 
The  span  is  seen  to  increase  with  x  to  account  for  the  increasing  noise, 
as  well  as  to  take  advantage  of  the  decreasing  curvature  of  f .,  (For  X  > 
0.7,  the  span  has  reached  the  largest  value  provided  in  the  program, 
k/N  -  0.7.)  Figure  1c  's  the  same  as  Figuie  1b  except  that  the  curve 
Y-f  (x)=sin[2n(l-x)2]  is  superimposed  for  reference,  The  resulting  smooth 
is  seen  to  estimate  the  underlying  f  reasonably  well,.  Note  that  for  a 
linear  function  y=ax+b  (zero  curvature)  the  smoother  will  tend  to  use  a 
constant  (maximum)  span  value  regardless  of  (the  variation  of)  the 


amplitude  of  the  noise. 


In  the  previous  sections.  Me  have  described  a  fairly  intricate 
scatterplot  smoother.  As  an  essential  building  block  of  projection 
pursuit  regression  (Friedman  and  Stuetzle,  1981),  it  has  performed  well. 
In  this  context,  the  smoother  is  applied  to  the  full  data  set  many  times 
in  a  single  run.  In  order  for  such  a  procedure  to  be  computationally 
feasible  for  large  data  sets,  it  is  necessary  that  the  smoother  be  as 
fast  as  possible.  One  possibility  to  increase  speed  is  by  binning.; 
Denote  the  observations  for  one  particular  scatterplot  by 
( x  i  ,yi  )....•  (xniyn)  ..  Ue  assume  here  that  the  observations  already  have 
been  sorted  so  that  the  Xi  are  in  increasing  order.  Choose  a  bin  size, 
say  m,  and  define  new  data  points  (Ui ,vi ) . . . (un/„, vn/*)  by 

u  i  =  mean  (x,  ?- 1 )  •«  i . . . x  i.) ; 
vi  =  mean  (y(  i-i  )aM 

Then  apply  the  smoother  described  above  to  the  (ui  ,vi ) . .  .:(un/»,  vn/*) . 
The  smooth  for  predictor  values  xj  not  among  Ui,.,un/B  can  be  obtained  by 
linear  interpolation  or,  at  the  boundaries,  by  extrapolation. 

The  computing  time  for  the  smoother  grows  linearly  in  the  number  of 
observations,  and  so  binning  reduces  the  run  time  of  the  smoother  roughly 
by  a  factor  of  m. 

Figure  2  shows  the  results  of  applying  the  smoother  to  a  sample  of 
n=500  observations  generated  from  thp  same  model  as  the  data  shown  in 
Figure  1,  with  the  results  o*  applying  the  binning  procedure  with  m=5 
super  imposed .  The  quality  of  the  smooth  is  seen  to  suffer  very  little 
while  the  computation  has  been  substantially  reduceu. 


As  for  all  data  analytic  procedures,  it  is  highly  desirable  for  a 
scatterplot  smoother  to  be  resistant  against  occasional  outliers  in  the 
data.  (All  our  analysis  is  conditional  on  the  observed  predictor  values; 
outlier  thus  means  outlier  in  response.)  The  smoother  described  in 
Sections  4  and  5  clearly  is  not  resistant.  One  way  to  overcome  this 
limitation  is  to  first  screen  the  data  with  a  rejection  rule  identifying 
outliers,  and  then  apply  the  smoother  to  the  cleaned  data  set. 

We  suggest  a  rejection  rule  based  on  running  medians.  A  running 
median  smoother  with  span  k  is  defined  by 
s(x  ; )  =  med  (*  j-k/2 . , . y ;+k/2 ) 

The  ends  of  the  sequence  must  be  treated  specially,  most  simply  by 
replicating  the  outermost  values  defined  above.  The  rejection  rule  makes 
five  passes  over  the  data:- 

1)  Compute  a  running  median  smooth  s. 

2)  Replace  s(x;)  by  s*(x,)  obtained  by  l’nearly  interpolating 

between  (x,-i»  s(x;-i))  .snd  (x;*,,  s(xiti))«  The  purpose  of 
this  step  is  to  ensure  a  more  realistic  estimate  of  spread  in 
steps  (3)  and  (4)  for  monotone  (sub)sequences,  which  are 

exactly  reproduced  by  a  running  median. 

"*)  Smooth  the  absolute  residuals  jr,|  =  |  y,  -  s*ixi)j  by  a 
running  median  and  obtain  a  sequence  v,  .  .v,,  of  local  spread 
estimates  v #  , . 

4)  Smooth  the  sequence  of  local  spread  estimates  by  a  moving 
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average  with  span  f*n  and  obtain  smoothed  spreads  v*i.  This 
makes  the  spread  estimates  more  stable.  The  same  effect  could 
be  achieved  by  increasing  the  span  of  the  running  medians  in 
Step  (3);  however,  this  would  be  more  expensive 
computationally.  In  the  code  given  in  Appendix  2,  the 
constant  f  is  set  to  0.3. 

5)  Flag  ail  observations  for  which  |  rj|  l  c-v*i.,  In  the  code,  c 
is  set  to  4.5. 

Some  details  related  to  the  treatment  of  ties  have  been  omitted.  A 
TORTRAN  subroutine  implementing  this  algorithm  is  listed  in  Appendix  2. 

The  span  for  the  running  medians  in  Steps  (1)  and  (3)  is  chosen  to  be 
Increasing  with  the  sample  si2e  n  (see  Table  2).  A  motivation  for  our 
particular  choices  is  given  in  Chapter  7.  We  use  the  same  span  in  both 
steps,  although  there  is  no  inherent  need  to  do  so. 

Figure  3a  shows  the  -esult  of  applying  the  rejection  rule  to  an 
artificial  data  set.  The  true  underlying  function  is  a  sine  uave.  The 
predictors  are  uniformly  distributed  in  [0,2a];  the  random  errors  are 
Gaussian  with  standard  deviation  0.3.  Outliers  occur  with  prot-abil-ly 
0.2;  they  uere  generated  by  adding  a  Gaussian  with  standard  deviation  2.4 
to  the  original  observation..  Observations  flagged  as  outliers  by  the 
rejection  rule  are  marked  by  squares.  Figure  3b  shows  the  results  of 
applying  the  rejection  rule  to  a  real  data  set. 


The  choice  of  span  k  <or  the  running  medians  in  Steps  .)  and  (3) 
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gives  rise  to  rather  interest  ng  questions.  Somewhat  vaguely  stated,  the 
rejection  rule  will  be  able  to  detect  extreme  outliers  as  long  as  these 
running  medians  do  not  break  down.  We  will  now  define  precisely  how  we 
measure  the  degree  of  resistance  of  a  smoother,  and  give  results  on  the 
dependence  of  the  resistance  of  a  running  median  smoother  on  the  span. 

Assume  we  want  to  smooth  a  sequence  of  length  n,  Responses  can  be 
either  "good"  or  "bad",  that  is,  good  observations  or  outliers.  We 
define  random  variables  bi...bn  by  bi  =  0  if  yj  is  good,  b ,  =  1  if  is 
an  outlier..  Assume  that  ProbCbj  =  1)  =  p  and  that  the  b(  are 
independent.  (As  noted  by  Mallows  ,1980),  the  latter  assumption  might 
not  always  be  realistic;  outliers  in  time  series  sometimes  come  in 
bursts.)  A  smoothed  value  s,  is  called  bad  if  it  can  be  made  arbitrarily 
large  by  suitable  choice  of  the  reponse  values  for  the  bad  observations, 
A  smoother  is  said  to  suffer  a  breakdown  if  one  or  more  of  the  smoothed 
values  Si  are  bad.  The  probability  that  this  happens  under  the  above 
model  for  the  b  is  called  the  breakdown  probability  of  the  smoother.  It 
will  generally  depend  on  p  and  n,  A  smoother  with  breakdown  probability 
(1-p)n  is  called  nonresistant (For  a  different  definition  of  breakdown 
probability,  see  Mallows  (1980),) 

We  will  now  devise  an  approximate  formula  for  the  span  necessary  to 
guarantee  an  upper  bound  on  the  breakdown  probability  as  a  function  of  n 
and  p  =  Prob(b,  =  1).  For  That  purpose,  we  define  new  random  van.tules 
s i .  s n -  ^ ♦  i  by 

i  +  k  - 1 

s<  '  l  bj. 

1 
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A  running  median  smoother  suffers  a  breakdown  if  among  any  consecutive  k 
observations  more  than  k/2  are  bad;  i.e.,  if  max  si  >  k/2.  This 
probability  does  not  seem  to  be  simple  to  evaluate,  but  it  is  easy  t.o 
obtain  an  upper  bound  in  terms  of  the  binomial  probability  Prob(b(x,p)  > 
k/2,  using  the  Bonferroni  inequality: 

Prob(max  si  >  k/2)  i  (n-k+1)  Prob(B(k,p)  >  k/2). 

This  inequality  provides  an  estimate  for  the  span  needed  to  guarantee  a 
certain  upper  bound  or«  the  breakdown  probability  for  given  n  and  p. 
Table  1  gives  estimates  of  the  necessary  span  k  for  breakdown  probability 
bounded  by  0.05,  and  several  values  of  n  end  p  (n=25, 50 , 1 00, 200, 400, 800; 
p=0. 05, 0 , 1 , 0 . 2) .  For  a  comparison,  we  also  list  the  percentage  of 
breakdowns  actually  observed  in  thousand  randomly  generated  Bernoulli 
sequences  for  tne  estimated  value  of  k,  and  the  smallest  value  of  k 
resulting  in  50  or  feuer  breakdowns.  The  results  show  that  the 

Bonferroni  estimate  is  close,  especially  for  p  =  0.05  and  p  =  0.;1  where 
Prob(B(k,p)  >  k/2)  is  small;  this  is  in  agreement  with  experience  gained 
in  using  the  Bonferroni  inequality  in  multiple  comparisons.;  The  span  of 
the  running  medians  in  Steps  (1)  and  (3;  of  the  rejection  rule  described 
in  Chapter  6  was  chosen  to  guarantee  a  breakdown  probability  of  less  than 
0.05  for  probability  p=0.  1  of  obtaining  an  outlier. 

Another  interesting  question  is  how  fast  the  span  k(n)  must  grow  as  a 
function  of  n  with  everything  else  fixed.  This  question  has  been 
answered  by  P,  Frdos  and  A.-  Renyi  (19/0)* 

Theorem :•  If  k(n)  -  cln  n  then 

s  i 

Pro!,  (  1  im  max  —  =  a )  -  1 , 

>•  »<»1  <  i  <n  -k  +  1  k 


wwsppwpnej* . jwwppqr^www'iw- *•*'  *»«.' 
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where  a  is  related  to  c  via  the  equation 

- 1  a  p 

—  =  vCn(p(1-p))  +  (1-p-a)  (JJn  -  -  Jin  - ), 

c  1-a  1-p 

for  a  >  c. 

This  theorem  i.  a  special  case  of  Erdos  and  Renyi's  theorem  2.-  It 
can  be  applied  to  our  situation  as  follows:  Choose  a  =  1/2  -  e . 
Then  thtjre  eyists  ar  n0  such  that  tor  all  n  >  no  ue  have  max  Sj  < 
k/2  for  almost  all  sampling  sequences.  In  addition,  Erdos  and  Renyi 
show  that 

-  If  k  grows  slower  than  In  n  (k(n)/ln  n-*0,),  then  for  a’l  but 
finitely  many  values  ot  n,  max  Si  =  k  for  almost  all  sampling 
sequences.,  ("k  cannot  grow  slower  than  In  n*\  ) 

-  If  k  grows  faster  than  In  n  (k(n)/ln  n-+«),  then  1  im  max  si=kp  for 

almost  all  sampling  sequences:  i,e.:,  the  strong  law  of  large 

numbers  applies.  ("k  does  not  have  to  grow  faster  than  In  n".-) 

6.  Aq  llpdat  i  ng  A1  gor  i  thm  for  Runn  i  ng  Hed  l  ans 

There  is  a  straightforward  way  to  compute  running  medians:  Obtain  the 
median  of  each  consecutive  k-tupel  by  sorting  it.:  That  can  be 

substantially  improved  upon  by  making  use  oi  the  fact  that  the  set  of 
responses  defining  s,«i  is  almost  the  same  as  the  set  of  responses 
defining  s : ;  only  y,n  =  yitk/20  has  to  be  added,:  and  yCut  =  y,-k/2  has 


to  be  deleted. 

The  following  rules  are  easy 

to  verify* 

-  II  y  in  =  Si 

,  then  s  ,  *  y  -  Si. 

-  If  y , n  >  s  , 

and  yOJt  )  s ,  or  if  y , „  <  s 

1  and  y„ut  <  s ,, 

then  s  ,  * 

s ,  . 
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So  in  the  case  of  random  data,  we  have  to  do  nothing  but  make  these  tests 
naif  the  time. 

-  If  yjn  >  Si  and  y0ut  <  Si,  then  let  k*  denote  the  number  of 
observations  in  the  new  span  that  are  bigger  than  si.  If  K+  <  k/2, 
then  Si  +  i  =  ,1 i ,  else  Si  +  i  is  the  smallest  observation  strictly  bigger 
than  »i.  The  analog  of  that  is  true  if  yjn  <  si  and  y0ut  >  Si. 

-  If  yi„  >  Si  and  yout  =  si>  then  define  k*  as  above.  If  k*  =  k/2, 
then  s i 4 i  is  the  smallest  observation  in  the  span  strictly  bigger 
than  Si;  el3e  Si  +  i  is  the  smallest  observation  in  the  span  that  is 
bigger  than  or  equal  to  si.  The  analog  of  that  is  true  if  yjn  <  Si 
and  Vout  =  Si. 

In  appendix  2,  we  give  a  FORTRAN  subroutine  that  implements  the  algorithm 
outlined  above.: 

For  random  data,  the  algorithm  will  take  0(nk)  operations.  It  is 
possible  to  reduce  that  to  0(n  log  k)  by  organizing  the  observations  in 
the  span  into  a  binary  tree  which  is  kept  balanced  as  observations  are 
moved  in  and  out  (AVl-tree;  see  Knuth  (1973),  pp  451),  Unfortunately, 
for  the  range  of  k  that  ue  have  in  mind  (about  20),  log  k  is  not  enough 
smaller  than  k  to  compensati  for  the  increased  overhead. 

S .  Discussion 

Cleveland  (1979)  has  suggested  a  scatterplot  smoother  also  based  on 
local  linear  fits.-  It  differs  from  the  one  described  in  this  report 
mainly  in  three  respects: 

-  K  does  not  use  variable  span, 

-  In  the  fit  of  the  local  straight  line  determing  the  smooth  s(x,)  for 


predictor  value  x;,  the  observations  are  weighted  according  to  their 
distance  from  x , •  observations  towarus  the  extremes  of  the  span 
receive  lower  weights  than  observations  with  predictor  values  close 
to  xi.  Asymptotic  calculations  suggest  that  assigning  unequal 
weights  should  reduce  the  error  of  the  curve  estimate,  but  there  is 
no  evidence  that  it  m?Aes  a  substantial  difference  for  sample  sizes 
occurring  in  practice.  It  dies,  houever,  produce  a  smoother  looking 
estimate « 

-  The  procedure  derives  i*s  resistance  properties  not  from  data 
screening  with  a  rejection  rule.  Instead,  each  of  the  local 
straight  lines  is  fitted,  not  by  least  squares,  but  by  a  resistant 
fitting  procedure. 

Updating  formulas  cannot  be  used  in  this  scheme,  making  it 
comparatively  expensive  in  terms  of  computing.;  To  reduce  computation, 
Cleveland  suggests  evaluating  the  smooth  only  for  every  n-th  predictor 
value.  The  parameter  m  here  plays  a  similar  role  as  our  bin  size;'  it 
would  be  chosen  as  a  fraction  of  k.  We  developed  our  smoothing  procedure 
because  variable  span  is  often  important,  and  because  the  use  of  updating 
formulas  dramatically  reduces  computation. 

Another  class  of  procedures  suggested  for  smoothing  are  procedures 
based  on  splines.  A  spline  function  s  of  order  J fr  with  knots  at  Zi...Zk 
is  a  function  satisfying  the  following  two  conditions; 

-  In  each  of  the  intervals  ( ,  z  1  ) ,  ( Zi ,  z  z  )  .  .  ( z*, t , *  ) ,  ( z  *  ,co) ,  s  is  a 
polynomial  of  degree  Z  -  1;, 

-  s  has  X.  -  2  continuous  deiivatives* 
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One  May  to  use  spline  functions  in  scatterpiot  smoothing  is  to  fit  a 
spline  function  with  knots  Zi.,.Zk  to  the  data  (xi  »y  i )  .• (xn,yn) ,  either 
by  least  squares  or  by  some  resistant  method..  The  degree  of  smoothness 
is  determined  by  the  number  and  position  of  the  knots.  A  major 
disadvantage  of  this  method  is  that  k+1  parameters  must  be  chosen:  the 
number  and  the  positions  of  the  knots.  Usually  some  heuristic  procedure 
is  used  to  place  the  knots  once  k  has  been  fixed  (Jupp,  1978).  This 
leaves  the  number  of  knots  to  be  determined.  This  number  plays  the  role 
of  the  span  in  determining  the  degree  of  smoothing.:  Unf or tunatel y ,  the 
output  of  the  smoother  can  depend  on  k  in  a  very  nonlinear  way;  it  is 
easy  to  construct  examples  where  the  addition  of  one  more  knot 
substantially  decreases  the  residual  sum  of  squares,  whereas  further 
knots  hardly  make  any  difference.  This  makes  k  more  difficult  to  choose 
than  the  span  in  a  local  averaging  smoother.;  Furthermore,  least  squares 
fit  of  splines  is  substantially  slower  so  that  choosing  k  through  cross- 
validation  is  usually  too  expensive. 


Another  way  is  to  use  smoothing  splines  in  the  sense  of  Reinsch 
(  1967).;  A  smoothing  spline  s  of  order  2.5  for  smoothing  parameter  X  is 
the  function  that  minimizes 


X  (y ,-f (x  * ) ) 2 


+ 


X 


fX  rt 


f fX) 2 (x)dx 


J  X  1 


among  all  functions  f  with  5  derivaties..  The  solution  really  turns  out 
to  be  a  spline  function  of  order  25  with  knots  X|> ..xn;  the  name  is  thus 
justified.  The  larger  X  is  chosen,  the  smoother  s  becomes;  thus,  X  here 
plays  the  role  of  the  spar,..  Computation  of  the  spline  for  given  A 


requires  the  solution  of  a  banded  n*n  linear  system 


A  drawback  of  the 
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method,  as  described  here,  is  that  it  is  impossible  to  -•'tain  an 
intuitive  feeling  for  the  choice  cf  X  in  a  given  example.  So,  one 
usually  fixes  not  X,  but  the  residual  sum  of  squares  around  the  smooth.; 
The  corresponding  value  of  X  then  has  to  be  found  iteratively  by 
repeatedly  solving  the  minimization  problem.  This  substantially 
increases  the  necessary  amount  of  computation..  Algorithms  to  determine 
the  optimal  X  by  cross-validation  usually  require  computation  of  the 
singular  value  decomposition  of  an  nxn  matrix;  they  are  expensive  and 
infeasible  for  sample  sizes  larger  tha;i  200-300. 

To  summarize,  the  local  averaging  smoother  described  in  this  report 
has  tuo  desirable  properties  that  set  it  apart  om  other  scatterplot 
smoothers:  it  is  very  fast  to  compute  and  the  value  of  the  parameter  tnat 
controls  the  amount  of  smoothing  is  optimized  locally  (through  cross- 
validation),  allowing  it  to  change  over  the  range  of  predictor  values.. 
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APPENOIX  I 


The  following  is  a  complete  listing  of  a  FORTRAN  -ub-out i ne 
implementing  the  smoothing  procedure  described  in  this  paper. 
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SUBROUTINE  SUPSMU  (N, X, Y, W, IPER, ALPHA, RESPAN, IBIN, SMO, SC) 


SUPER  SMOOTHER  (FRIEDMAN  AND  STUETZLE,  1982) 


CODED  BY: 


INPUT: 
N  : 


J.  H.  FRIEDMAN  AND  W.  STUETZLE 
DEPARTMENT  OF  STATISTICS  AND 
STANFORD  LINEAR  ACCELERATOR  CENTER 
STANFORD  UNIVERSITY 
STANFORD  CA.  94305 


NUMBER  OF  OBSERVATIONS  (X,Y  -  PAIRS). 


X ( N )  :  ORDERED  ABSCISSA  VALUES. 

Y(N )  :  CORRESPONDING  ORDINATE  (RESPONSE)  VALUES, 

W(N)  (OPTIONAL)  WEIGHT  FOR  EACH  (X,Y)  OBSERVATION. 

W  <  0  0  =>  ALL  OBSERVATIONS  HAVE  EQUAL  WEIGHT. 

IPER  :  PERIODIC  VARIABLE  FLAG. 

IPER=1  =>  X  IS  ORDINARY  ORDERED  VARIABLE. 

IPER=2  =>  X  IS  A  PERIODIC  (CIRCULARLY  DEFINED) 

ALPHA  :  FRACTIONAL  SMOOTHNESS  PENALITY  (  SEE  (LO) 

RESPAN  5  FRACTIONAL  SPAN  FOR  RESIDUAL  SMOOTHING 
(  L/N,  SEE  (8)  SECTION  4), 

RESPAN  v L T .  0  =>  FIXED  SPAN  SMOOTHER  WITH  FRACTIONAL 

SPAN  =  ABS(RE.SPAN)  . 

BINNING  FACTOR  (M,  SEE  SECTION  7),. 


VA'  ABLE, 
SECTION  4 


IBIN  : 
OUTPUT: 

SMO ( N ) 
SCRATCH: 
SC( 3,N) 


SMOOTHED  ORDINATE  (RESPONSE)  VALUES. 


INTERNAL  WORKING  STORAGE. 


C  NOTE: 

C  ALPHA=0 .  1  AND  RESPAN=0.25  AR”  REASONABLE!  VALUES.  FOR  RESPAN  >  0 

C  SMOOTHER  OUTPUT  IS  COMPLETELY  CROSS-VALI DATED;  X(I),  Y(H,  AND 

C  W(I)  ARE'  NOT  USED  IN  THE  CALCULATION  OF  SMO  (  1  )  .  THEREFORE, 

C  THE  AVERAGE  SQUARED  RESIDUAL 
C  N 

C  ASH  -  SUM  W  (  I  )  *  (  Y  (  I  )  -  SMO  (  I  )  )  *  *  2 

C  1  =  1 

C  CAN  BE  USED  AS  A  GOODNKSS-OF-F IT  MEASURE  FOR  THE  PURPOSE  )F 
C  SELECTING  OPTIMAL  VALUES  FOR  SMOOTHING  1 ARAMETERS  BY 
C  REPEATED  APPLICATION . 

C 

C  FOR  SMALL  SAMPLES'  (N  <  40)  OR  IF  THERE  ARE  SUBSTANTIAL  SERIAL 

C  CORRELATION'S  BETWEEN  OBSERATLONS  CLOSE  IN  X  -  VALUE,  THEN 

C  A  PRESPECIFIED  FIXED  SPAN  SMOOTHER  (RESPAN  <  ))  SHOULD  BE 

C  USED.  REASONABLE  SPAN  VALUES  ARE  0.3  ,GE,  ABS  ( RESPAN )  :.GK,  0.5. 
C 

C - -  -  - - - - - 

DIMENS  ION  X l N ) , Y ( N ) , W ( N ) , SMO 1 N ) , SC ( 3 , N ) 

DOUBLE  PRECISION  SX  (  5  )  ,  SY  (  5  )  ,  SXX  (  5  )  ,  SXY  (  5  )  ,  SUM  (  5  )  ,  FBW(  1.  \ 
DIMENS  ION  I  BW  (  5  )  ,  KESQUE’ (  5  ,  10  1  5  ,  SMOQUE  ?  5,  101  ) 

INTEGER  IN, OUT 

DATA  I  BW  1 ,  S' P UMAX  /  3 ,  O'.,  3  5/ 

UAT\  BIG',  EPS  M  OFPO,  !  ,OE-0  3  / 
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IF  (W(l) .LT.O.O)  GO  TO  20 
DO  10  1=1, N 
SC ( 3 , I ) =W ( I ) 

10  CONTINUE 
GO  TO  40 

20  DO  30  1=1, N 
SC ( 3 , I ) =1 . 

30  CONTINUE 

40  IF  ( X ( N ) . GT . X ( 1 ) )  GO  TO  70 
SX(1)=0.0 
FBW ( 1 ) =SX { 1 ) 

DO  50  J=1 , N 

SX(l)®SX(l)+SC(3. J)*YiJ) 

FBW ( 1 ) =FBW ( 1 ) +SC ( 3 , J ) 

50  CONTINUE 

A=SX ( 1 ) / FBW ( 1 ) 

DO  60  J=1 , N 
SMO ( J ) =A 
60  CONTINUE 
RETURN 
70  I=N/4 

J=3*  I 

SCALE=X ( J ) -X ( I ) 

80  IF  ( SCALE  ..GT..  0.0)  GO  TO  SO 
IF  (J.LT..N)  J=J  +  1 
IF  (  I  •.  GT,.  1  )  1  =  1-1 
SCALE=X ( J ) -X ( I ) 

GO  TO  80 

90  VSML= ( EPS* SCALE ) *  *  2 

IF  ( I B I N . LE.  L )  GO  TO  1 ' J 
NA=0 

SX(  L  )  =  0  0 
SY ( 1 ) =SX ( 1 ) 

FBW ( 1 )=SY( 1 ) 

DO  100  0=1, N 
S a ( L )=SX( L )+X(Jj*SC( J, J) 

SY ( 1 ) =SY ( l )+Y(J)*SC.(  3,o) 

FBW ( I ) =FBW ( 1 )+SC( 3, J  J 

IF  ( MOD ( J , I B 1 N ) , NE . 0 )  GO  TO  xOO 

NA=NA+ 1 

SC( 1 , NA) =SX ( 1 ) / FBW ( 1 ) 

SC ( 2 , NA 1 =SY ( i ) /FBW ( l ) 

SC( 3, NA)=FBW( L ) 

SX ( 1 ) =0 . 0 
SY ( 1 ) =SX ( 1 ) 

FBW ( 1 ) =SY ( I ) 

LOO  CONTINUE 

IF  |(  MOD ( N  ,  I  B I N  )  ...  EC* •  0  )  GO  TO  130 
NA-NA+ I 

SC(  L  N A )  =  S X  {  i  J/FBWf  1  ) 

SC  (2  NA  )  --SY  (  1  )  /  FBW  (  1  ) 

SC(  3,NA)=FBW(  1  } 

GO  'l\y  1  JO 
110  NA=N 

DO  130  1  =  1,  N 
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SC(1, J)=X(J) 

SC { 2 , J )=Y ( J  ) 

120  CONTINUE 
130  IBW ( 1 ) =IBW1 

IDELTA= ( SPNMAX*NA-IBW{  1 ) ) /4 . 0+0 . 5 
DO  140  1=2,5 

IBW ( I ) =MIN0 ( NA/ 2 , I BW { I - 1 ) +1 DELTA ) 

140  CONTINUE 

DO  150  1=1,5 
EX { I ) =0 . 0 
SY { I ) =SX { I ) 

SXX ( I ) =SY ( I ) 

SXY ( I ) =SXX ( I ) 

FBW  ( I  )  =>’XY  ( I ) 

SUM( I )=F3W( I ) 

IF  ( MOD ( 1 BW ( I ) , 2 ) . EQ . 0 )  IBW( I )=IBW( I )+l 
150  CONTINUE 

IF  ( RESPAN. GE. 0. 0)  GO  TO  160 
IBWS=1 

IBW ( 1 ) =0 . 5* ABS ( RES PAN ) *NA 

IF  ( MOD ( I BW ( 1 ) , 2 ) . EQ . 0 )  IBW{ 1 ) =IBW( 1 )+l 

IBW( 5 )=IBW( 1 ) 

CO  TO  170 
160  I BWS=5 

170  IF  ( I  PER . NE . 2 )  GO  TO  220 
IT=NA-I BW  v  5 )  +  1 
IH=IBW ( 5 ) - 1 
DO  190  J=1T , NA 
DO  180  1=1 , IBWS 

IF  ( J . LT . N A- I BW ( I ) + 1 )  GO  TO  180 
XT=SC ( 1 , J ) 

YT=SC ( 2 i J ) 

WT=SC ' 3 , J ) 

SX( I ) =SX ( 1 ) +XT*WT 
SY(I )=SY(I )+YT*WT 
SXX(I)=SXX( 1 ) +XT*XT*WT 
SXY(  I  )=SXY ( I )+XT*YT*WT 
FBW ( I )=FBW( I ) +WT 
180  CONTINUE 
190  CONTINUE 

DO  210  J=1 , IN 
DO  2U0  1=1, IBWS 
IF  ( J  .  GT .  i BW ( I ) - 1 )  GO  TO  200 
XT"SC ( 1 , J  ) 

YT=SC  ('/>  ,J } 

WT=SC { 3 , J ) 

SX ( I ) =Sa ( I ) +XT*WT 
S Y ( I )=SY ( I ) +YT* WT 
SXX (I )=SXX( I )+XT*XT*WT 
SXY ( I ) =SXY ( I ) +  XT*  YT*WT 
FBW ( 1 ) =FBW ( I ) +WT 
200  CONTINUE 
210  CONTINUE' 

GO  TO  230’ 

220  I T-=2  *  1  BW  (  5  )  -  1 


iM.SJlwfl«"."U*fl 


TPTWE9* 
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T 


DO  240  J=l, IT 
DO  230  1=1 , IBWS 

IF  (J.GT.2*IBW(I)-1)  GO  TO  230 
XT=SC ( 1 , J ) 

YT=SC ( 2 , J ) 

WT=SC (  3  ,  J  ) 

SX ( I ) =SX ( I ) +XT*WT 
SY ( I ) =SY ( I ) +YT*WT 
SXX ( I ) =bXX ( I ) +XT*XT*WT 
SXY ( I ) =SXY ( I ) +XT* YT*WT 
FBW ( I ) =FBW ( I ) +WT 
230  CONTINUE 
240  CONTINUE 

250  KBW=MIN0 (101, 2* INT { 0 . 5*RESPAN*NA+0 . 5)+l) 

KBW02=KBW/2+l 

IH=0 

JT=IH 

JM=JT 

DO  370  J=1,NA 
RESMIN=BIG 

IF  (J.LT.KBW02)  GO  TO  260 

JT=JT+1 

JM0=JM 

JM=MOD ( JM , KBW) + 1 
260  lll=MOD  ( IH ,  KBW )  +  1 
DO  310  1  =  1 , 1 BWS 
IF  (IBWS.NE.5)  GO  TO  270 
XT=SC ( 1 , J) 

YT=SC ( 2 , J ) 

WT=SC ( 3 , J ) 

SX  (  I  )  =  SX  ( I  ),  -  XT*  WT 
SY  ( I  )  =  SY  (  [  )  --YT*  WT 
SXX ( I ) =SXX ( I ) ~XT*XT*WT 
SXY ( I )=SXY ( I ) -XT*YT*WT 
FBW (I )=FBW(I)-WT 
270  OUT=J-IBW{ I ) 

IN=vJ  +  IBW(  t  J  -  L 

IF  UlPER.-NE.  2  )  .  AND.  (OUT.  LT.  ]  .OR.  IN.GT.  NA)  )  GO  TO  2Bu 
1 F  '( OUT  ,  LT  .  1  )  OUT=NA+OUT 
IF  (  IN GT  NA )  IN  =  IN-NA 
XT=SC ( 1 , OUT ) 

YT=SC( 2 , OUT ) 

WT=SC  ('  3  ,  OUT  ) 

SX  '(  I  /=SX(  L  )  -  XT*  WT 
SY ( I ) =SY ( 1 ) -YT*WT 
S  XX  {  I  1=S'XX|  I  \  -XT*X1'*WT 
SXY ( I ) =SXY ( l ) -XT* YT*WT 
FBW ( I ) =FBW ( I ) - WT 
XT=SC ( 1 , IN ) 

YT=SC ( 2 ,  IN  5 
WT=SC'(  3,  IN); 

SX(  1  )'-SX(  I  J  +XT*  WT 
SY ( I  J  =  S  Y ( I ) t  YT* WT 
SXX(  1  )---SXX(  1  j  fXT*X  !’* W’l' 

SXY  (  I  .)  =SX  Y  I  }  *  XT*  YT*WT 
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FBW { I ) =FBW ( I ) +WT 
280  D=SXX(I)-SX{I)**2/FBW(I) 

VAR=D/FBW(I) 

A=0 . 0 

IF  (VAR.GT. VSML)  A= (SXY ( I ) -SX ( I ) *SY( I ) /FBW( I ) ) /D 
SM=A*  SC ( 1 , J )  +  ( SY ( I ) -A*  SX ( I ) ) /FBW ( I ) 

IF  (IBWS.NE.l)  GO  TO  290 
SMO ( J ) =SM 
GO  TO  320 

290  RES=SC  (  3  ,  J  )  *  ( SC  (  2  ,  J  )  -  SM)  '"*2 

IF  (J.GT.KBW)  SUM{ I )=SUM( I ) -RESQUE ( I , IH ) 

SUM( I )=SUM( I )+RES 
RESQUE (I, IH)=RES 
SMOQUE(I, IH)=SM 

IF  (VAR. LT. VSML. AND. I. LT. 5)  SMOQUE ( I , IH )=BIG 
IF  (J.LT.KBW02)  GO  TO  300 
SUM ( I ) =SUM ( I ) -RESQUE ( I , JM ) 

IF  (JT.GT.l)  SUM ( I ) =SUM ( I ) +RESQUE ( I , JMO ) 

IF  ( SUM ( I ) . GT . RESMIN . OR . SMOQUE ( I , JM ) . GE . BIG )  GO  TO  300 
RESMIN=SUM ( I ) 

IS=I 

300  XT=SC ( 1 , J  ) 

YT=SC ( 2 , J  ) 

WT=SC ( 3 , J  ) 

SX(  I  )-~SX(  I  )+XT*WT 
SY ( I ) =SY ( I )+YT*WT 
SXX ( I ) =  SXX ( I ) +XT*XT*WT 
SXY ( I ) =  SXY ( I ) +XT* YT*WT 
FBW ( I ) =FBW ( I ) +WT 
310  CONTINUE 

320  IF  (IBWS.EQ.l)  GO  TO  370 

IF  (J.GE.KBW02)  SMO ( JT ) =SMOQUE ( IS , JM , 

IF  ( ALPHA . LE .0.0. OR. J . LT . KBW02 . OR . IS . GL . 5 )  GOTO  370 
RESMIN- ( 1 . 0+ALPHA) * RESMIN 
1  =  5 

GO  TO  340 
330  I  =  IM-1) 

340  IF  ( (-1 )* ( (I )-(IS) ) .GT.O)  GO  TO  350 
IF  (SUM( I ) .GT. RESMIN)  GO  TO  330 
350  IF  (I. GE. 5)  GO  TO  360 

A= (RESMIN-SUM(I) ) / ( SUM ( 1  + 1 ) -SUM ( I )  ) 

SMO  (JT)  =  (  1-.-0-A)  *SMOQUE {  l ,  JM )  +  A*  SMOQUE ( I  + 1 ,  JM ) 

GO  TO  370 

360  SMO ( JT ) =SMOQUE ( I , JM) 

370  CONTINUE 

IF  (IBWS.NE.5)  GO  TO  440 
JT=JT+1 

DO  430  J-JT,NA 
IH=MOD(  III,  KBW)  +  ) 

RESMIN=B IG 
JM0--JM 

JM=MOD(  JM,  KBW )  4-1 
DO  380  1-1,5 

SUM  j,  I  )=SUtl(  I  )  -  RESQUE  ( I ,  IH )  +RESQUE  ( I ,  JMO  ) -RESQUE  (  I,  JM) 
IF  iSUM(i)  .GT.  RESMIN.  OR .:  SMOQUE  (  I,  JM)  .GE.  BIG)  GOTO  380 


RESMaN=SUM ( I ) 

IS  =  I 

380  CONTINUE 

SMO ( J ) =SMOQUE ( IS , JM ) 

IF  {ALPHA. LE. 0.0. OR. IS.GE. 5)  GO  TO  430 

RESMIN= { 1 . 0+ALPHA ) ‘RESMIN 

1=5 

GO  TO  400 
390  I=I+(-l) 

400  IF  ( ( -1 ) * ( (I)-(IS) ) .GT.O)  GO  TO  410 
IF  ( SUM { I ) . GT . RESMIN )  GO  TO  390 
410  IF  (I.GE.5)  GO  TO  420 

A= ( RESMIN -SUM (I))/ (SUM(I  +  1) -SUM ( I )  ) 

SMO ( J ) = ( 1 . O-A ) *  SMOQUE ( I , JM ) +A*SMOQUE ( 1+1 , JM ) 
GO  TO  430 

420  SMC ( J ) =SMOQUE ( I , JM ) 

430  CONTINUE 
440  IT=NA-1 

S2=SMO ( 1 ) 

IF  (IPER.NE.,2)  GO  TO  450 
A=S2 

SMO ( 1  )  =0 . 2 5*  ( SMO (NA)  +  2 . 0*S2  l-SMO ( 2 ) ) 

GO  TO  460 

450  SMO  ( I  )  =  0. 25*  (  2 . 0*S2+3 . 0*SMO( 2  )  (  3  )  ) 

460  DO  470  J  =  2 , IT 
S 1  =S  2 
S2=SMO ( J ) 

SMO (J )=0. 25* { SI  +  2 . 0*S2+SMO (J  +  l ) ) 

470  CONTINUE 

[F  ( IPER.NE. 2 )  GO  TO  480 

SMO  ( NA  J  =  0 .,  25*  ( A+2  .;  0*SMO  ( NA )  +S2  ) 

GO  TO  490 

480  SMO (NA J=0. 2  5* (2 . 0*SMO { NA )  +  3 . 0*S2-S 1 ) 

490  IF  (i»IN,LE.l)  GO  TO  550 
DO  500  1=1, NA 
SC '(  2  ,  I  )  -SMO  (  I  ) 

500  CONTINUE 

XUP=SC( 1 , 1  )-l . 

J  =0 

DO  540  1=1, N 
XI=X(I  ) 

IF  (XI . LL . XUP J  GO  TO  530 
J=J+1 

XLOW=SC ( l , J ) 

XUP=SC( 1, J+l ) 

YLuW=SC( 2, J ) 

YUP=SC( 2, J+l ) 

IF  (  XLOW  .  NE  .-  XUP  |  GO  TO  510 
SLOPH=0. 

GO  TO  520 

510  S  LO  PE=(YUP-Y LOW ) /(XUP-XLOW) 

520  IF  (J+l . Eu - NA |  XUP=X(Ni 
530  SMO (  1  1-YLOW r (Xl-XLOW) ‘SLOPE 
540  CONTINUE 
550  J=1 
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560  J0=J 

SY{ 1 )-SMO( J ) 

IF  (W(l) .LE.0.0)  GO  TO  570 
S Y ( 1 ) — W ( J ) *  SMO ( J ) 

FBW ( 1 ) =W ( J ) 

570  IF  ( J . GE . N )  GO  TO  610 
580  IF  (X(J+1 ) .GT.X(J) )  GO  TO  610 
J=J+1 

IF  (W(J) .GT.0.0)  GO  TO  590 
SY ( 1 ) =SY ( 1 ) +SMO ( J ) 

GO  TO  600 

590  S Y ( 1 ) =S Y ( 1 ) +W ( J ) *  SMO ( J ) 

FBW ( 1 ) =FBW ( 1 ) +W ( J ) 

600  IF  (J.LT.N)  GO  TO  580 
610  IF  (J.LE.JO)  GO  TO  630 

IF  (W(-l). LE.0.0)  FBW ( 1 ) =J-J0+1 
SY ( 1 ) =S Y ( 1 ) / FBW ( 1 ) 

DO  620  I=J0,J 
SMO ( I ) =SY ( 1 ) 

620  CONTINUE 
630  J=J+1 

IF  (J.LE.N)  GO  TO  560 

RETURN 

END 
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APPENDIX  II 


The  following  is  a  complete  listing  of  a  FORTRAN  subroutine 
implementing  the  rejection  rule  described  in  this  paper. 


o  c.  r.  r.  t ~  c.  o  c. 
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SUBROUTINE  REJECT  (PRED,RESP,N, WEIGHT, SCRAT) 

c - 

C 

C  REJECTION  RULE  FOR  SMOOTHING  (FRIEDMAN  AND  STUETZLE,  1982) 

C 

C  CODED  BY:  J.  H.  FRIEDMAN  AND  W.  STUETZLE 
C  DEPARTMENT  OF  STATISTICS  AND 
C  STANFORD  LINEAR  ACCELERATOR  CENTER 
C  STANFORD  UNIVERSITY 
C  STANFORD,  CA.  94305 

C 

C  INPUT: 

C  PRED(N) 

RESP(N) 

N 

OUPUT: 

WEIGHT (N)  : REJECTION  FLAGS. 

WEIGHT ( I ) =0  IF  OBSERVATION  I  IS  CONSIDERED  AN  OUTLIER 
WEIGHT ( I ) =1  OTHERWISE 

SCRATCH: 

SCRAT (N, 2) : INTERNAL  WORKING  STORAGE 

C 

C  NOTE: 

C  REJECT  USES  SUBROUTINE  RUNMED  (SEE  BELOW) 

y 

C - 

DIMENSION  PRED(N) ,RESP(N) , WEIGHT ( N ) ,SCRAT(N,2) 

DATA  FACT/ 4. 5/ 

DATA  RELSPA/O.j/ 

IF  (N.GT.25)  GO  TO  10 
I BAND-7 
GO  TO  50 

10  IF  (N.GT. 100)  GO  TO  20 
IBAND=9 
GO  TO  50 

20  IF  (N.GT. 400)  GO  TO  30 
I BAND-11 
GO  TO  50 

30  IF  (N.GT. 800)  GO  TO  40 
IBAND-13 
GO  TO  50 

40  IBAND=1 5 

50  CALL  RUNMED  ( RESP , WEIGHT , N, I  BAND ) 

IFIRST=IBAND/2+l 
lLAST=N-IBAND/2 
DO  60  1=1 , I FIRST 
SCRAT ( I , 1 )=WEIGHT(I) 

60  CONTINUE 

DO  70  1=1 LAST, N 
SCRAT ( I , 1 ) -WEI GUT U ) 

70  CONTINUE 


ABSCISSA  VALUES  IN  INCREASING  ORDER 
CORRESPONDING  ORDINATE  (RESPONSE)  VALUES 
NUMBER  OF  OBSERVATIONS  (X.Y-PAIRS) 
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DO  90  I =1 FIRST, I LAST 

IM1 =1-1 

IP1=I+1 

IF  (PRED(IMl) .NE.PRED(IPl) )  GO  TO  80 
SCRAT ( I , 1 ) =0 . 5* ( WEIGHT ( IM1 ) +WEIGHT ( IP1 ) ) 

GO  TO  90 

00  SCRAT(I, 1)=WEIGHT(IM1 )  + (WEIGHT (IP1) -WEIGHT (IM1 ) } * ( PRED ( I ) -PRED ( IM 

1) )/(PRED(IPl)-PRED(IMl) ) 

90  CONTINUE 
1=0 

IOC  IF  (l.GE.N-l)  GO  TO  150 
1=1+1 
M0=I 

110  IF  ( PRED ( 1+1 ) . GT . PRED ( I ) )  GO  TO  120 
1=1+1 

IF  (I.LT.N)  GO  TO  110 
120  IF  (I.EQ.MO)  GO  TO  100 
NTIE=I -M0+1 
R=0 . 

DO  130  J=MO, I 
R=R+SCRAT ( J , 1 ) 

130  CONTINUE 
R=R/NTIE 
DO  143  J=M0 , I 
SCRAT ( J , 1 )=R 
140  CONTINUE 
GO  TO  100 
150  DO  160  1=1, N 

WEIGHT ( I ) =ABS ( RESP ( I ) -SCRAT ( 1 , 1)  1 
160  CONTINUE 

CALL  RUNMED  (WEIGHT, SCRAT ( 1 , 1 ), N, I  BAND) 

IS2=N* RELSPA/ 2 . 

SUM=0. 

DO  170  1=1, IS2 
SUM=SUM+SCRAT ( I , 1 ) 

170  CONTINUE 
ISEFF=IS2 
DO  200  1=1, N 

IF  (I.GT.N-I&2)  GO  TO  100 
SUM=SUM+ SCRAT ( I+IS2 , 1 ) 

I SEFF=I SEFF+1 

180  IF  (I.LE.IS2+1)  GO  TO  190 
SUM=SUM-SCkAT( I-1S2-1, 1 ) 

ISEFF=ISEFF- I 
1 90  SCRAT (1,2* =SUM/ ISEFF 
200  CONTINUE 
1=0 

210  IF  (I.GE.N-1)  GO  TO  260 
1  =  1  +  1 
M0=I 

220  IF  ( PRED ( I + 1 ) , GT . PRED ( I ) )  GO  TO  230 
1  =  1+  L 

IF  (I.LT.N)  GO  TO  '2  20 
230  IF  (1.-EQ.MO)  GO  TO  210 
NT IE= I -M0+ l 
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R=0. 

DO  240  J=M0, I 
R=R+SCRAT( J, 2 ) 

240  CONTINUE 
R=R/NTIE 
DO  250  J=M0 ,  I 
SCRAT ( J, 2 )=R 
250  CONTINUE 
GO  TO  210 
260  DO  280  1=1 ,  N 

IF  ^WEIGHT(I) .LE. FACT* SCRAT (I, 2) )  GO  TO  270 
WEIGHT ( I )=0. 

GO  TO  280 
270  WEIGHT ( I ) =1 • 

200  CONTINUE 
RETURN 
END 

C - -> - 

c 

c 

SUBROUTINE  RUNMEU  ( SEQ , SMO , N , IBAND ) 

C - 

C 

C  FAST  RUNNING  MEDIAN  FINDER  (FRIEDMAN  AND  STUETZLE,  1982). 

C 

C  CODED  BY  :  J.  H.  FRIEDMAN  AND  W.  STUETZLE 
C  DEPARTMENT  OF  STATISTICS  AND 

C  STANFORD  LINEAR  ACCELERATOR  CENTER 

C  STANFORD  UNIVERSITY 

C  STANFORD,  CA.  94305 

C 
C 

C  INPUT: 

C  SEQ(N)  : RESPONSES  IN  ORDER  OF  INCREASING  PREDICTOR  VALUES 

C  N  : NUMBER  OF  OBSERVATIONS 

C  IBAND  : SPAN  OF  RUNNING  MEDIANS  (HAS  TO  BE  ODD  AND  <=21 ) 

C 

C  OUTPUT: 

C  SMO(N)  : SMOOTHED  RESPONSES 

C 

C  NOTE: 

C  THE  MAXIMAL  SPAN  CAN  BE  INCREASED  BY  INCREASING  THE  DIMENSION 
C  OF  THE  ARRAYS  SCRAT  AND  ITAG 
C 

C - 

DIMENSION  SEQ (N) , SMO (N ) 

DIMENSION  SCRAT (21), ITAG (21) 

DATA  RIN//1.E20/ 

DO  10  1=1, IBAND 
SCRAT (I )- SEQ ( I ) 

1 T  \Q  ( I  )  =  1 

i()  CONTINUE 

RMIN=SCRAT ( l \ 

IMIN= 1 

DO  20  1=2, IBAND 
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IF  (SCRAT(I) .GE.RMIN)  GO  TO  20 
RMIN=SCRAT( I ) 

IMIN=I 

20  CONTINUE 

TEMP=SCRAT ( 1 ) 

SCRAT ( 1 ) =RMIN 
SCRAT ( IMIN ) =TEMP 
ITAG ( 1 )~1MIN 
ITAG ( IMIN ) =1 
1=3 

GO  TO  40 
30  1=1+1 

40  IF  ( (I) .GT. (IBAND) )  GO  TO  60 

IF  ( SCRAT ( I )  . GE . SCRAT ( I - 1 ) )  GO  TO  30 
TEMP=SCRAT ( I ) 

ITEMP=ITAG(I) 

J=I 

50  SCRAT  ( J  )  =SCRAT  ( J--1 ) 

ITAG ( J ) =ITAG ( J-l ) 

J=J-1 

IF  ( SCRAT ( J-l ) .GT. TEMP)  GO  TO  50 
SCRAT (J)=TEMP 
ITAG ( J ) =ITEMP 
GO  TO  30 

60  I RAND 2=1 BAND/ 2+1 

RMED=SCRAT ( IBAND2 ) 

DO  70  1  =  1 , IBAND2 
SMO ( I ) =  RMED 
70  CONTINUE 
IFI RST=2 
ILAST=IBAND  1-1 
ISMO=IBAND2+l 
TMED=RMED 

80  YIN=SEQ ( I LAST ) 

YOIJT=SEQ(  IFIRST-1 ) 

IF  '( YIN.  GE.  RMED)  GO  TO  180 
IF  (YOUT.GE.RMED)  GO  TO  90 
RNEW=RMED 
GO  TO  290 

90  IF  ( YOUT. LE. RMED)  GO  TO  120 
KMINUS=0 
RNF.W=-RINF 

DO  110  I =1 FIRST, I LAST 
ST=SEQ(I  ) 

IF  (SI-.LT.  RMED )  GO  TO  100 
GO  TO  110 

100  KMINUS=KMINUS+1 

IF  (SI.LE.RNEW)  GO  TO  110 
RNEW=Si 
110  CONTINUE 

IF  {  KMINUS  .  GE  ..  I  BAND  2  )  GO  TO  290 
RNEW=RMED 
GO  TO  290 
120  KM1NUS=0 
RTS----R1 NT 


»."U  PJWUI. 
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RSE=-RINF 

DO  160  I=IFIRST, ILA3T 
SI=SEQ ( I ) 

IF  (SI.LE.RMED)  GO  TO  130 
GO  TO  160 

130  IF  (SI.GE.RMED)  GO  TO  150 
KMINUS=KMINUS+1 
IF  (SI.LE.RTS)  GO  TO  140 
RTS=SI 

140  IF  (SI.LE.RSE)  GO  TO  160 
RSE=SI 
GO  :iO  160 
150  RSE-SI 
160  CONTINUE 

IF  ( KMINUS . EE . IBAND2 )  GO  TO  170 
RNEW=RTS 
GO  TO  290 
170  RNEW=RSE 
GO  TO  290 

180  IF  (YIN.LE. RMED)  GO  TO  280 
IF  (YOUT.LE. RMED)  GO  TO  190 
RNEW=RMED 
GO  TO  290 

190  IF  (YOUT.GE. RMED)  GO  TO  220 
KPLUS=C 
RNEW=RINF 

DO  210  I -I FIRST, I LAST 
SI=SEQ(I  ) 

IF  (SI.GT.RMED)  GO  TO  200 
GO  TO  210 

200  KPLUS=KPLUS+1 

IF  (SI.GE.RNEW)  GO  TO  210 
RNKW=SI 
210  CONTINUE 

IF  ( KPLUS . GE . I BAND 2 )  GO  TO  290 
RNEW=RMED 
GO  TO  290 
220  KPLUS=0 
RTB=RINF 
RBE=RI 

DO  260  .FIRST, I LAST 
SI=SEQ ( 1 ) 

IF  (SI.GE.RMED)  GO  TO  230 
GO  TO  260 

230  IF  (SI.LE.RMED)  GO  TO  250 
KPLUS=KPLUS+1 
IF  (Sl.GE.RTB)  GO  TO  240 
RTB-SI 

240  IF  (SI.GE.RBE)  GO  TO  260 
RBE=S1 
GO  TO  2b0 
250  RBE=S1 
260  CONTINUE 

IF  ( KPLUS . NE . 1BAND2 )  GO  TO  270 
RNEW-RTB 


GO  TO  290 
270  RNEW=RBE 
GO  TO  290 
280  RNEW=RMED 
290  RMED=RNEW 

SMO ( ISMO ) =RMED 
IFIRST=IFIRST+1 
ISMO=ISMO+l 
ILAST=ILAST+1 
IF  ( I LAST . LE . N )  GO  TO  80 
DO  300  I -ISMO, N 
SMO ( I ) =RMED 
300  CONTINUE 
RETURN 
END 


a 

i 
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Figure  Captions 


Figure  la:  Two  hundred  observations  (points)  drawn  from  the  model 

Y=sin[2u(1-X)*3  +  X«  with  €  iid  standard  normal. 

Figure  lb:  The  data  of  Figure  la  with  the  computed  smooth 

superimposed.  The  height  of  the  bottom  curve  is  proportional  to 
the  span  value  employed  at  the  corresponding  abscissa  value. 

Figure  1c:  Same  as  Figure  1b  with  the  addition  of  the  curve 

Y=sin[2rr(  1-X)23 


Figure  2:  Five  hundred  observations  from  the  same  model  as  Figure 

1,  with  the  computed  smooths  for  both  m=1  and  m=5.: 

Figure  3a:  Output  of  rejection  rule  applied  to  artificial  dati.  set. 
Rejected  observations  are  marked  by  squares. 

Figure  3b:  Output  of  rejection  rule  applied  to  real  data  set. 
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TABLE  I 


n 

Pbad 

K 

Jtbds 

K 

25 

0.05 

5 

1.0 

5 

50 

0.05 

7 

0.5 

5 

100 

0.05 

7 

1.0 

7 

200 

0.05 

7 

2.0 

7 

400 

0.05 

9 

1.0 

7 

800 

0.05 

9 

2.0 

9 

25 

0.  1 

9 

0.5 

7 

50 

0,1 

9 

2.3 

9 

100 

0.  1 

11 

1.4 

9 

200 

0.  1 

13 

0.5 

11 

400 

0.  1 

13 

1,0 

1 1 

800 

0.  1 

15 

0.0 

13 

25 

0.2 

15 

1.9 

13 

50 

0,2 

21 

0,7 

15 

100 

0.2 

23 

1,4 

19 

200 

0.2 

27 

1.8 

21 

400 

0.2 

31 

1,  9 

27 

800 

0.2 

33 

2.1 

31 

n:  length  of  sequence 

Pb»d :  probability  of  an  outlier 

K:  Bonferroni  estimate  of  span  necessary  to  guarantee 

breakdoun  probability  <  0.05 

Jibds:  Percentage  of  breakdoun  actually  observed  in  1000  Mcnte 
Carlo  trials  for  span  K.: 

K :  Span  necessary  to  guarantee  breakdown  probability  <  0.05 

(estimated  from  1000  Monte  Carlo  trials).. 


TABLE  II 


_ _  K 

i  25  7 

£100  9 

£400  }1 

<800  13 

>800  ,5 


’ength  of  sequence 

span  of  running  medians  in  steps  (1)  and  (3)  of  rejection 
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